Abstract

We initiate the study of property testing of submodularity on the boolean hypercube. Sub- 
modular functions come up in a variety of applications in combinatorial optimization. For a vast 
range of algorithms, the existence of an oracle to a submodular function is assumed. But how 
does one check if this oracle indeed represents a submodular function? 

Consider a function $f : \{0,1\}^n \to \mathbb{R}$. The distance to submodularity is the minimum fraction
of values of $f$ that need to be modified to make $f$ submodular. If this distance is more than $\epsilon > 0$,
then we say that $f$ is $\epsilon$-far from being submodular. The aim is to have an efficient procedure that,
given input $f$ that is $\epsilon$-far from being submodular, certifies that $f$ is not submodular. We analyze
a very natural tester for this problem, and prove that it runs in subexponential time. This gives
the first non-trivial tester for submodularity. On the other hand, we prove an interesting lower
bound (that is, unfortunately, quite far from the upper bound) suggesting that this tester cannot
be very efficient in terms of $\epsilon$. This involves non-trivial examples of functions which are far from
submodular and yet do not exhibit too many local violations.

We also provide some constructions indicating the difficulty in designing a tester for submod- 
ularity. We construct a partial function defined on exponentially many points that cannot be 
extended to a submodular function, but any strict subset of these values can be extended to a 
submodular function. 



1 Introduction 



Submodular functions have been studied in great depth in combinatorial optimization [Edm70,
NWF78, FNW78, Lov83, Fra97, Sch00, FFI01]. A set function $f : 2^U \to \mathbb{R}$ is submodular if for all
$S, T \subseteq U$, $f(S \cup T) + f(S \cap T) \le f(S) + f(T)$. An alternative and equivalent view of submodularity is the
monotonicity of marginal values. For all $S \subseteq T$ and elements $i \notin T$, a submodular function satisfies
$f(S \cup \{i\}) - f(S) \ge f(T \cup \{i\}) - f(T)$. We will think of $f$ as a function $\{0,1\}^n \to \mathbb{R}$.

These functions are often used in many algorithmic applications and very naturally show up 
when modeling utilities. It is quite common to assume that algorithms have oracle access to some 
submodular function: given a set $S$, we have access to $f(S)$. Observe that, in general, the description
of the submodular function $f$ has size that is exponential in $n$, whereas most algorithms that use $f$
run in polynomial time. This means that these algorithms look at a very tiny fraction of $f$, yet their
behavior depends on a very global property of $f$. This leads to the very natural question: what if
the function $f$ provided to the algorithm was not submodular? Could the algorithm detect this, or
would it get fooled? Obviously, if $f$ is constructed by taking a submodular function and making
very few changes to the values, then there is no need to think that algorithms should be affected.
On the other hand, if $f$ is "significantly different" from a submodular function, the behavior of these
algorithms could be very different.

Let us formally explain the notion of being different from a submodular function. Since polynomial
time algorithms are sublinear with respect to the size of $f$, it is natural to use some property
testing terminology. A function $f$ is $\epsilon$-far from being submodular if $f$ needs to be changed at an
$\epsilon$-fraction of values to make it submodular. In polynomial time, can we detect that such a function
is not submodular? If this is not possible, then this raises some very fundamental questions about
submodularity. If the plethora of algorithms used cannot tell whether their input $f$ is submodular
or not, then in what sense are they actually using the submodularity of $f$? This would suggest that
the algorithms exploit a property more general than submodularity. It would be strange if we expect
input functions $f$ to have a property (submodularity), but we cannot even check if these functions
deviate significantly from submodularity.

The main question here is whether submodularity is testable, i.e., is there a polynomial time
procedure that distinguishes submodular functions from those that are $\epsilon$-far? (This question was
first posed as an open problem in [PRR03], in the context of submodularity testing over grids.
Their results focused on testing over large low-dimensional grids rather than the high-dimensional
hypercube $\{0,1\}^n$.) More concretely, what kind of structural properties of submodularity
do we need to address? Property testing algorithms, especially those for functions on the
hypercube, usually check for some local property. These algorithms check if the desired property 
holds in a small local neighborhood, for some randomly chosen neighborhoods. If no deviation is 
detected, then property testers conclude that the input function is close to the property. Do similar 
statements hold for submodularity? We show non-trivial upper and lower bounds connecting local 
submodularity violations to the distance. 

Property testing proofs often show that a function is close to a property by explicitly modifying 
the function to make it have the property. Usually, there is some procedural method to perform 
this conversion. This raises a very interesting question about partial submodular functions: suppose 
one is given a partial function over the hypercube. This means that some set of values is defined, 
but the remaining are left undefined. Under what circumstances can this be completed into a 
submodular function? If this cannot be completed, can we provide a small certificate of this? For 
a vast majority of natural testable properties (over functions on the hypercube, e.g. monotonicity) 
such small certificates do exist. Unfortunately, this is no longer true for submodularity. We present 
an example showing that a minimal certificate of non-extendability can be exponentially large. 
1.1 Our results



Before we state our main theorems, we first set some notation. 

Definition 1.1 Denote by $e_i \in \{0,1\}^n$ the canonical basis vector which has 1 in the $i$-th coordinate
and 0 everywhere else.

For a function $f : \{0,1\}^n \to \mathbb{R}$, $i \in [n]$ and $x \in \{0,1\}^n$ such that $x_i = 0$, we define the marginal
value of $i$ (or discrete derivative) at $x$ as $\partial_i f(x) = f(x + e_i) - f(x)$.

A function $f$ is submodular if for any $i \in [n]$ and $x, y \in \{0,1\}^n$ such that $x_i = y_i = 0$ and $x \le y$
coordinate-wise, $\partial_i f(x) \ge \partial_i f(y)$.

The distance $d(f,g)$ between two functions $f$ and $g$ is the fraction of points $x$ where $f(x) \ne g(x)$.
Let $\mathcal{S}$ be the set of all submodular functions. The distance of $f$ to submodularity is $\min_{g \in \mathcal{S}} d(f,g)$.
We say $f$ is $\epsilon$-far from being submodular if the distance of $f$ to submodularity is more than $\epsilon$.
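As a concrete illustration of Definition 1.1, the following minimal sketch checks the marginal-value condition by brute force on a small cube. The names `marginal`, `is_submodular`, `capped` and `squared` are illustrative choices and not part of the paper; functions are stored as dictionaries keyed by 0/1 tuples, which is only feasible for small $n$.

```python
from itertools import product

def marginal(f, x, i):
    """Discrete derivative of f at x in direction i (requires x[i] == 0)."""
    y = x[:i] + (1,) + x[i + 1:]
    return f[y] - f[x]

def is_submodular(f, n):
    """Brute-force check of Definition 1.1: marginals are non-increasing."""
    points = list(product((0, 1), repeat=n))
    for x in points:
        for y in points:
            if not all(a <= b for a, b in zip(x, y)):
                continue  # only compare pairs with x <= y coordinate-wise
            for i in range(n):
                # y[i] == 0 together with x <= y forces x[i] == 0 as well
                if y[i] == 0 and marginal(f, x, i) < marginal(f, y, i):
                    return False
    return True

n = 3
points = list(product((0, 1), repeat=n))
capped = {x: min(sum(x), 2) for x in points}   # concave in |x|, hence submodular
squared = {x: sum(x) ** 2 for x in points}     # convex in |x|, not submodular
print(is_submodular(capped, n), is_submodular(squared, n))
```

The check takes time exponential in $n$, which is exactly the cost a property tester is meant to avoid.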

Definition 1.2 A property tester for submodularity is an algorithm with the following properties. 

• If $f$ is submodular, then the algorithm answers YES with probability 1.¹

• If $f$ is $\epsilon$-far from submodular, then the algorithm answers NO with probability at least 2/3.

• The number of queries made to $f$ is sublinear in the domain size, which is $2^n$. (Ideally, the
number of queries is polynomial in $n$ and $1/\epsilon$.)



Submodularity vs. monotonicity. Our first observation is that testing submodularity is at
least as hard as testing monotonicity. More formally, the problem of testing monotonicity for a
function $f : \{0,1\}^n \to \mathbb{R}$ can be reduced to the problem of testing submodularity for a function
$f' : \{0,1\}^{n+1} \to \mathbb{R}$. We present this reduction in Section 5.

A consequence of this is that known lower bounds for monotonicity testing apply also to
submodularity testing. For example, it is known that a non-adaptive monotonicity tester requires at
least $\Omega(\sqrt{n})$ queries [FLN+02]. We remark that the best known monotonicity tester on $\{0,1\}^n$ takes
$O(n^2/\epsilon)$ queries [DGL+99] and is non-adaptive.

Submodularity can be naturally viewed as "second-degree monotonicity", i.e. monotonicity of
the discrete partial derivatives $\partial_i f$. So a very natural test for submodularity is to simply run
a monotonicity tester on the functions $\partial_i f$. In one direction, it is clear that for a submodular
function, such a tester would always accept. However, it is not clear whether this tester would
recognize functions that are far from being submodular and label them as such.

Monotonicity testers search randomly for pairs $x, x + e_i$ such that $f(x) > f(x + e_i)$. Such
a pair of points can be naturally called a "violated pair". It is known that if $f$ is $\epsilon$-far from
being monotone, then the fraction of violated pairs is at least $\epsilon/n^{O(1)}$ [GGL+00, DGL+99]. If
we want to test submodularity by reducing to a monotonicity tester in each direction, this means
that we are looking for violations of the following type: $x \in \{0,1\}^n$ such that $x_i = x_j = 0$ and
$f(x + e_i) - f(x) < f(x + e_i + e_j) - f(x + e_j)$. We call such violations violated squares.

Definition 1.3 We call $\{x, x + e_i, x + e_j, x + e_i + e_j\}$ a square. This is called a violated square
if $f(x) + f(x + e_i + e_j) > f(x + e_i) + f(x + e_j)$. The density of violated squares is the number of
violated squares divided by $\binom{n}{2} 2^{n-2}$.
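To make Definition 1.3 concrete, here is a small brute-force sketch that counts violated squares and normalizes by the total number of squares. The helper names `flip` and `violated_square_density` are illustrative, and the exhaustive enumeration is only meant for small $n$.

```python
from itertools import combinations, product
from math import comb

def flip(x, i):
    """Return x + e_i for a 0/1 tuple x with x[i] == 0."""
    return x[:i] + (1,) + x[i + 1:]

def violated_square_density(f, n):
    """Fraction of squares {x, x+e_i, x+e_j, x+e_i+e_j} that are violated."""
    violated = 0
    for x in product((0, 1), repeat=n):
        zeros = [i for i in range(n) if x[i] == 0]
        for i, j in combinations(zeros, 2):
            if f[x] + f[flip(flip(x, i), j)] > f[flip(x, i)] + f[flip(x, j)]:
                violated += 1
    # There are C(n, 2) * 2^(n-2) squares in total.
    return violated / (comb(n, 2) * 2 ** (n - 2))

n = 3
points = list(product((0, 1), repeat=n))
linear = {x: sum(x) for x in points}          # every square holds with equality
squared = {x: sum(x) ** 2 for x in points}    # every square is violated
print(violated_square_density(linear, n), violated_square_density(squared, n))
```

The two examples sit at the extremes: a linear function has density 0, while $|x|^2$ violates every square.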

Our main combinatorial result consists of two bounds on the relationship of the distance from 
submodularity and the density of violated squares. 



¹ We are actually dealing with one-sided testers here. If we allowed a probability of error for this case, that would
be a two-sided tester.
Theorem 1.4 Let $n$ be a sufficiently large integer.

• Let $\epsilon \in (0, e^{-5})$. For any function $f : \{0,1\}^n \to \mathbb{R}$ that is $\epsilon$-far from being submodular, the
density of violated squares is at least $\epsilon^{O(\sqrt{n}\log n)}$.

• For any $\epsilon \ge 2^{-n/10}$, there is a function $f : \{0,1\}^n \to \mathbb{R}$ which is $\epsilon$-far from being submodular
and its density of violated squares is less than $\epsilon^{4.8}$.

The first part of the theorem is proven through relatively basic observations. The second part 
is quite technical and requires a much deeper understanding of submodularity. 

Theorem 1.4 provides evidence that testing submodularity is very different from testing
monotonicity. An intuition one might get from monotonicity testing is that if a natural extension to
submodularity exists, its dependence on $\epsilon$ should be relatively mild, perhaps linear or quadratic.
We show that this is not the case; in particular, if the dependence is a polynomial in $1/\epsilon$, the degree
of the polynomial would have to be at least 5. This holds even in the range of exponentially small
$\epsilon = 2^{-\Theta(n)}$, which means that $\mathrm{poly}(n)/\epsilon^{4.8}$ queries for any polynomial in $n$ are not enough. This
might be interpreted as counterintuitive to the notion that the dependence is polynomial at all.
However, we cannot currently push this construction any further.

The first part of Theorem 1.4 implies immediately that a submodularity tester that checks
$q = 1/\epsilon^{O(\sqrt{n}\log n)}$ random squares succeeds with high probability.² Note that this is a non-adaptive
tester, because the queries do not depend on the function values. To our knowledge, this is the first
testing result asymptotically better than the trivial tester checking $2^{\Theta(n)}$ squares.

Corollary 1.5 There is a subexponential time non-adaptive tester for submodularity. This
procedure samples $1/\epsilon^{O(\sqrt{n}\log n)}$ squares at random and checks if any are violated. If the input $f$ is $\epsilon$-far
from being submodular, this procedure rejects with high probability.

Extending partial functions. A partial function f is one that is defined on only some subset 
of the hypercube. Such a function is extendable, if the remaining values can be filled in to get a 
submodular function. Although the question of extending partial functions is interesting in itself, it
also has some relevance to the question of testing submodularity.

Any proof of a property tester must show that if a function $f$ passes the tester (with high
probability), then $f$ must be $\epsilon$-close to submodularity. This is usually done by arguing that if $f$ has
a sufficiently low density of local violations, one can modify an $\epsilon$-fraction of values and remove all
"obstructions" to submodularity. Since an $f$ that passes the tester must have a low density of local
violations, $f$ is $\epsilon$-close. An understanding of these obstructions to submodularity is often helpful
for designing testers. An obstruction is just a subset of values that cannot exist in any submodular
function.

Given a partial function $f$ that is not extendable, we would ideally like to find a small certificate
for this property. Unfortunately, we will show that such certificates can be exponentially large.
We give a partial function with a surprising property. The partial function $f$ is defined on an
exponentially large set and is not extendable. If any single value is removed, then this new function
is extendable.

Definition 1.6 For a partial function $f$, let $\mathrm{def}(f)$ be the set of domain points where $f$ is defined.
Let $A \subseteq \{0,1\}^n$. The restriction of $f$ to $A$, $f|_A$, is the partial function that agrees with $f$ on $A$ and is
undefined everywhere else. The partial function $f$ is minimally non-extendable if $f$ is not extendable
but $f|_A$ is extendable for all $A \subsetneq \mathrm{def}(f)$.

Theorem 1.7 There exists a minimally non-extendable function $f$ such that $|\mathrm{def}(f)| = 2^{\Omega(n)}$.

² We use "high probability" to refer to probability $\ge 2/3$.
1.2 The difficulty in testing submodularity

The values of $f$ can interact in non-trivial ways to create obstructions to submodularity. Contrast
this to monotonicity. A partial function $f$ (on the hypercube) cannot be extended to a non-decreasing
monotone function iff there is a pair of sets $S \subset T$ such that $f(S) > f(T)$. There is always a
certificate of size 2 that a partial function cannot be extended. So this completely characterizes
the obstructions to monotonicity, and is indeed one of the reasons why monotonicity testers work.
Our work implies that such a simple characterization does not exist for submodularity. Indeed, as
Theorem 1.7 claims, obstructions to submodularity can have an extremely complicated structure.
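The size-2 certificate for monotonicity described above can be searched for directly. The following sketch is only an illustration of that characterization; the function name `monotone_obstruction` and the two toy partial functions are assumptions made for the example, with partial functions stored as dictionaries on the points where they are defined.

```python
from itertools import combinations

def monotone_obstruction(partial):
    """Return a pair (S, T) with S <= T and f(S) > f(T), or None if none exists."""
    for s, t in combinations(partial, 2):
        for lo, hi in ((s, t), (t, s)):
            if all(a <= b for a, b in zip(lo, hi)) and partial[lo] > partial[hi]:
                return lo, hi
    return None

# Two partial functions on {0,1}^3, given only on a few points.
good = {(0, 0, 0): 0, (1, 0, 0): 1, (1, 1, 1): 1}
bad = {(0, 0, 0): 0, (1, 0, 0): 2, (1, 1, 0): 1}
print(monotone_obstruction(good), monotone_obstruction(bad))
```

The search is quadratic in the number of defined points. Theorem 1.7 says that no analogous constant-size certificate exists for submodularity.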

Functions that are far from being submodular can "hide" their bad behavior. In Theorem 3.3
we show the existence of a function $f$ with exactly one violated square, but making $f$ submodular
requires changing $2^{n/2}$ values. Somehow, even though the function is (in a weak sense) "far" from
submodular, the only local violation that manifests itself is a single square. The functions described
by the second part of Theorem 1.4 are constructed through generalizations of this example.

1.3 Previous work 

Property testing, which was defined in [RS96, GGR98], is a well-studied field of theoretical computer
science. Efficient testers have been given for a wide variety of combinatorial, algebraic, and geometric
problems (see surveys [Fis01, Gol98, Ron01]). The problem of property testing for monotonicity
over the hypercube has been studied in [GGL+00, DGL+99, FLN+02, Fis04, FR, BCGSM10]. In
particular, monotonicity of a function over $\{0,1\}^n$ can be tested using $O(n^2/\epsilon)$ non-adaptive queries
[DGL+99] and $\Omega(\sqrt{n})$ queries are necessary [FLN+02].

As mentioned earlier, the problem of testing submodularity was first raised by [PRR03].
They considered submodularity over general grid structures (of which the hypercube is a special
case). Their focus was on testing submodularity over 2-dimensional grids. Specifically, [PRR03]
gave strong results for testing Monge matrices. Monge matrices are essentially submodular functions 
over the n x m integer grid. Here, the dimension is 2, but the domain in each component is 
large. In contrast, we are studying submodular functions over high-dimensional domains, where 
each component is binary. Hence, our problem is quite orthogonal to testing Mongeness, and we 
need a different set of techniques. 

Another related set of results is recent work on learning and approximating submodular functions
[GHIM09, BH09]. Here, we want to examine a value oracle through polynomially many queries
(which is similar to our setting) and learn sufficient information so that we are able to answer
queries about the function. The difference is that in this model, we care about multiplicative-factor
approximation to the original function. An even more essential difference is that the input function
is guaranteed to be submodular, rather than possibly being corrupted. For example, [GHIM09]
shows that we can "learn" a monotone submodular function using polynomially many queries so
that afterwards we can answer value queries within a multiplicative $O(\sqrt{n})$ factor, and this is optimal
up to logarithmic factors. In contrast, the input function in our model might be masquerading as a
submodular function but in truth be very far from being submodular.

1.4 Organization 

The rest of the paper is organized as follows. In Section 2 we present our basic submodularity tester
and prove the first part of Theorem 1.4. In Section 3 we present our construction of submodular
functions from lattices and prove the second part of Theorem 1.4. In Section 4 we discuss
extendability of submodular functions and prove Theorem 1.7. In Section 5 we present the reduction from
monotonicity testing to submodularity testing. In Section 6 we discuss future directions.
2 A subexponential submodularity tester



The violated-square tester.

• For a parameter $q \in \mathbb{Z}$, repeat the following $q$ times.

• Sample uniformly at random $x \in \{0,1\}^n$ and $i, j \in \{\ell : x_\ell = 0\}$. If

$$f(x) + f(x + e_i + e_j) > f(x + e_i) + f(x + e_j),$$

i.e. if $\{x, x + e_i, x + e_j, x + e_i + e_j\}$ is a violated square, then return NO.

• If none of the tested squares is violated, then return YES.
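The tester above can be sketched in a few lines. This is a hedged illustration rather than a reference implementation: the oracle is modeled as a Python callable on 0/1 tuples, the name `violated_square_tester` and the `rng` parameter are illustrative, and the handling of a sample $x$ with fewer than two zero coordinates (skipping it) is an assumption, since the description does not say what to do in that case.

```python
import random

def violated_square_tester(f, n, q, rng=None):
    """Sample q random squares; answer "NO" as soon as one is violated."""
    rng = rng or random.Random()
    for _ in range(q):
        x = [rng.randint(0, 1) for _ in range(n)]
        zeros = [k for k in range(n) if x[k] == 0]
        if len(zeros) < 2:
            continue  # assumption: no square has x as its bottom point, skip
        i, j = rng.sample(zeros, 2)
        xi, xj, xij = list(x), list(x), list(x)
        xi[i] = xj[j] = xij[i] = xij[j] = 1
        # Four oracle queries per sampled square.
        if f(tuple(x)) + f(tuple(xij)) > f(tuple(xi)) + f(tuple(xj)):
            return "NO"
    return "YES"

linear = lambda x: sum(x)          # submodular: always accepted
squared = lambda x: sum(x) ** 2    # every square is violated
print(violated_square_tester(linear, 6, 200, random.Random(0)))
print(violated_square_tester(squared, 6, 200, random.Random(0)))
```

The queries are chosen without looking at any function value, which is what makes the tester non-adaptive, and a submodular input can never be rejected, which is the one-sided property.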

Clearly, if the input function is submodular, the tester answers YES. We would like to understand
how well this tester performs in case the input function is $\epsilon$-far from being submodular. The following
observation is standard and reduces this question to a combinatorial problem about violated squares.

Lemma 2.1 The following two statements are equivalent: 

• The violated-square tester using $q(n, \epsilon)$ queries detects every function that is $\epsilon$-far from
submodular with constant probability.

• For every function which is $\epsilon$-far from submodular, the density of violated squares is $\Omega(1/q(n, \epsilon))$.

Therefore, to understand this tester we need to understand the relationship between the distance
from submodularity and the density of violated squares. In the rest of this section, our main goal is
to prove the first part of Theorem 1.4, i.e. the claim that for a function $\epsilon$-far from submodular, the
density of violated squares must be at least $\epsilon^{O(\sqrt{n}\log n)}$. Using Lemma 2.1, this implies Corollary 1.5.
First, we prove the following lemma.

Lemma 2.2 Assume $\{x, x + e_i, x + e_j, x + e_i + e_j\}$ is a violated square. Then it is possible to
decrease all the values either in $\{y : y \le x\}$ or in $\{y : y \ge x + e_i + e_j\}$ by a constant such that the
square $\{x, x + e_i, x + e_j, x + e_i + e_j\}$ is no longer violated and no new violated square is created.

Proof: Denote by $d = f(x) + f(x + e_i + e_j) - f(x + e_i) - f(x + e_j)$ the "deficit" of the violated
square. One way to fix this square is to decrease the value of $f(x)$ by $d$; however, this might create
new violated squares. Instead, we decrease the value of $f(y)$ for every $y \le x$; i.e., we define a
new function $\tilde{f}(y) = f(y) - d$ for $y \le x$, and $\tilde{f}(y) = f(y)$ otherwise. (Alternatively, we can define
$\tilde{f}(y) = f(y) - d$ for $y \ge x + e_i + e_j$, and $\tilde{f}(y) = f(y)$ otherwise; the analysis is symmetric and we
omit this case.)

Consider any other square that was previously not violated, i.e. $f(x') + f(x' + e_{i'} + e_{j'}) \le
f(x' + e_{i'}) + f(x' + e_{j'})$. Note that $x'_{i'} = x'_{j'} = 0$. We consider four cases:

• If $x'_\ell > x_\ell$ for some coordinate $\ell$, then we do not modify any value in the square $\{x', x' +
e_{i'}, x' + e_{j'}, x' + e_{i'} + e_{j'}\}$.

• If $x' \le x$ and both $x_{i'} = 0$ and $x_{j'} = 0$, then the only value we modify in the square is $f(x')$,
which is decreased by $d$. This cannot create a submodularity violation.

• If $x' \le x$ and exactly one of the coordinates $x_{i'}, x_{j'}$ is 1, then we modify two values in the
square; for example $f(x')$ and $f(x' + e_{i'})$. Since we decrease both by the same amount, this
again cannot create a submodularity violation.

• If $x' \le x$ and $x_{i'} = x_{j'} = 1$, then we decrease all four values in the square by the same amount.
Again, this cannot create a submodularity violation. □
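The repair step of Lemma 2.2 can be checked on a small example. The sketch below, with illustrative names `flip`, `violated_squares` and `fix_downward`, subtracts the deficit $d$ on the down-closed cube below $x$ and confirms that the chosen square is repaired while no new violated square appears. The starting function $|x|^2$ is just a convenient example in which every square is violated.

```python
from itertools import combinations, product

def flip(x, i):
    """Return x + e_i for a 0/1 tuple x with x[i] == 0."""
    return x[:i] + (1,) + x[i + 1:]

def violated_squares(f, n):
    """Set of triples (x, i, j), i < j, whose square violates submodularity."""
    out = set()
    for x in product((0, 1), repeat=n):
        zeros = [k for k in range(n) if x[k] == 0]
        for i, j in combinations(zeros, 2):
            if f[x] + f[flip(flip(x, i), j)] > f[flip(x, i)] + f[flip(x, j)]:
                out.add((x, i, j))
    return out

def fix_downward(f, x, i, j):
    """Subtract the deficit d from f(y) for every y <= x (first option in Lemma 2.2)."""
    d = f[x] + f[flip(flip(x, i), j)] - f[flip(x, i)] - f[flip(x, j)]
    return {y: v - d if all(a <= b for a, b in zip(y, x)) else v
            for y, v in f.items()}

n = 3
f = {x: sum(x) ** 2 for x in product((0, 1), repeat=n)}
before = violated_squares(f, n)
square = ((0, 0, 1), 0, 1)                 # bottom point x = (0,0,1), directions 0 and 1
after = violated_squares(fix_downward(f, *square), n)
print(len(before), len(after), after <= before)
```

Only the two points below $(0,0,1)$ change here, which matches the cost accounting used in the counting argument that follows.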






This means we can fix violated squares one by one, and the number of violated squares decreases 
by one every time. The cost we pay for each fix is the number of points in the cube above or 
below the respective square. Recall that we count the number of modified values overall, and hence 
what counts is the union of all the cubes modified in the process. Intuitively, it is more frugal to 
choose up-closed cubes for violated squares that are above the middle layer of the hypercube, and 
down-closed cubes for squares that are below the middle. A counting argument gives the following. 

Lemma 2.3 Let $\epsilon \in (0, e^{-5})$ and let $f$ have at most $\epsilon^{\sqrt{n}\log n} 2^n$ violated squares. Then these violated
squares can be fixed by modifying at most $\epsilon 2^n$ values.

Proof: Denote by $B$ the set of bottom points for the violated squares which are below the middle
layer; i.e. we have $\|x\|_1 \le n/2$ for each $x \in B$. (The squares above the middle layer can be handled
symmetrically.) We choose to modify the down-closed cube, $C_x = \{y \in \{0,1\}^n : y \le x\}$, for each
$x \in B$. We can fix the violated squares one by one, by modifying values in the cubes $C_x$. The total
number of modified values is $|\bigcup_{x \in B} C_x|$. We estimate the cardinality of this union by combining
two simple bounds across levels of the hypercube. Denote $L_j = \{x \in \{0,1\}^n : \|x\|_1 = j\}$. We have

$$\Big|\bigcup_{x \in B} C_x\Big| = \sum_{j=0}^{n/2} \Big|\bigcup_{x \in B} (C_x \cap L_j)\Big|.$$



First, by the union bound, we have

$$\Big|\bigcup_{x \in B} (C_x \cap L_j)\Big| \le \sum_{x \in B} |C_x \cap L_j| = \sum_{x \in B} \binom{\|x\|_1}{j} \le |B| \binom{n/2}{j}.$$

Secondly, we have (trivially)

$$\Big|\bigcup_{x \in B} (C_x \cap L_j)\Big| \le |L_j| = \binom{n}{j}.$$

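We choose the better of the two bounds depending on $j$. In particular, for $j \le n/2 - a\sqrt{n}$, we
get $\sum_{j=0}^{n/2 - a\sqrt{n}} \binom{n}{j} = 2^n \Pr[X \le n/2 - a\sqrt{n}] \le 2^n e^{-a^2}$ where $X$ is a binomial $\mathrm{Bi}(n, 1/2)$ random
variable and the last inequality is a standard Chernoff bound. For $j > n/2 - a\sqrt{n}$, we use
$\sum_{j = n/2 - a\sqrt{n}}^{n/2} |B| \binom{n/2}{j} = |B| \sum_{k=0}^{a\sqrt{n}} \binom{n/2}{k} \le |B| n^{a\sqrt{n}}$. We conclude that

$$\Big|\bigcup_{x \in B} C_x\Big| = \sum_{j=0}^{n/2} \Big|\bigcup_{x \in B} (C_x \cap L_j)\Big| \le 2^n e^{-a^2} + |B| n^{a\sqrt{n}}.$$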

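Let $a = \frac{1}{2}\ln(1/\epsilon)$; we also assume that $|B| \le 2^n \epsilon^{\sqrt{n}\ln n}$. For $\epsilon \in (0, e^{-5})$, this implies

$$\Big|\bigcup_{x \in B} C_x\Big| \le 2^n e^{-(\frac{1}{2}\ln(1/\epsilon))^2} + 2^n \epsilon^{\sqrt{n}\ln n}\, n^{\frac{1}{2}\sqrt{n}\ln(1/\epsilon)} = \big(\epsilon^{\frac{1}{4}\ln(1/\epsilon)} + \epsilon^{\frac{1}{2}\sqrt{n}\ln n}\big) 2^n \le \tfrac{1}{2}\epsilon 2^n.$$

□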



This lemma immediately implies the first part of Theorem 1.4. Assuming that $f$ is $\epsilon$-far from
being submodular, we get that the number of violated squares is at least $\epsilon^{\sqrt{n}\log n} 2^n$ for $\epsilon \in (0, e^{-5})$,
i.e. the density of violated squares is at least $\epsilon^{O(\sqrt{n}\log n)}$.
3 Few violated squares, yet large distance



We now give a construction of functions that have large distance from submodularity but a relatively small
fraction of violated squares. As we mentioned earlier, these bounds are nowhere near our positive
results. Nonetheless, we are able to show a significant difference from monotonicity.

Our first tool to construct these functions is an interesting family of submodular functions. It
is known that the set of minimizers of a submodular function always forms a lattice³ [Edm70].
We prove that conversely, for any lattice $\mathcal{L} \subseteq \{0,1\}^n$ there is a submodular function whose set of
minimizers is exactly $\mathcal{L}$. We will then piece together these submodular functions to construct a
non-submodular function with the desired properties.

3.1 Submodular functions from lattices 

Lemma 3.1 Let $\mathcal{L} \subseteq \{0,1\}^n$ be a lattice, i.e. a set of points closed under coordinate-wise minimum
and maximum. Then the following Hamming distance function is submodular:

$$d_{\mathcal{L}}(x) = \min_{y \in \mathcal{L}} \|x - y\|_1.$$

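Proof: [Lemma 3.1] In this proof, we use the set-function notation and identify $\{0,1\}^n$ with subsets
of $[n]$. A lattice $\mathcal{L} \subseteq \{0,1\}^n$ is a family of sets closed under taking unions and intersections. The
distance function $d$ can be written as

$$d(S) = \min_{L \in \mathcal{L}} |S \Delta L|$$

where $S \Delta L$ denotes the symmetric difference. Assume that $d(S) = |S \Delta U|$ and $d(T) = |T \Delta V|$ for
some $U, V \in \mathcal{L}$. We want to prove $d(S \cup T) + d(S \cap T) \le d(S) + d(T)$. We prove in fact that

$$|(S \cup T) \Delta (U \cup V)| + |(S \cap T) \Delta (U \cap V)| \le |S \Delta U| + |T \Delta V|$$

which is sufficient since $U \cup V, U \cap V \in \mathcal{L}$ by the lattice property, and therefore $d(S \cup T) \le
|(S \cup T) \Delta (U \cup V)|$, $d(S \cap T) \le |(S \cap T) \Delta (U \cap V)|$. These two symmetric differences can be bounded
as follows, writing $\overline{A}$ for the complement of $A$:

$$|(S \cup T) \Delta (U \cup V)| = |(S \cup T) \setminus (U \cup V)| + |(U \cup V) \setminus (S \cup T)|$$
$$= |S \cap \overline{U} \cap \overline{V}| + |\overline{S} \cap T \cap \overline{U} \cap \overline{V}| + |U \cap \overline{S} \cap \overline{T}| + |\overline{U} \cap V \cap \overline{S} \cap \overline{T}|$$
$$\le |S \cap \overline{U} \cap \overline{V}| + |\overline{S} \cap T \cap \overline{V}| + |U \cap \overline{S} \cap \overline{T}| + |\overline{U} \cap V \cap \overline{T}|,$$

$$|(S \cap T) \Delta (U \cap V)| = |(S \cap T) \setminus (U \cap V)| + |(U \cap V) \setminus (S \cap T)|$$
$$= |S \cap T \cap \overline{V}| + |S \cap T \cap \overline{U} \cap V| + |U \cap V \cap \overline{T}| + |U \cap V \cap \overline{S} \cap T|$$
$$\le |S \cap T \cap \overline{V}| + |S \cap \overline{U} \cap V| + |U \cap V \cap \overline{T}| + |U \cap \overline{S} \cap T|.$$

Adding up the two bounds and merging terms such as $|S \cap \overline{U} \cap \overline{V}| + |S \cap \overline{U} \cap V| = |S \cap \overline{U}|$, we obtain

$$|(S \cup T) \Delta (U \cup V)| + |(S \cap T) \Delta (U \cap V)| \le |S \cap \overline{U}| + |T \cap \overline{V}| + |U \cap \overline{S}| + |V \cap \overline{T}| = |S \Delta U| + |T \Delta V|.$$

□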
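Lemma 3.1 is easy to sanity-check numerically on a small cube. In the sketch below, points of the cube are encoded as bitmasks, so union and intersection become `|` and `&`; the helper names `lattice_closure`, `distance_to` and `is_submodular`, as well as the particular generating sets, are illustrative assumptions. The second example shows that closure under union and intersection matters: for a family that is not a lattice the distance function fails to be submodular.

```python
def lattice_closure(points):
    """Close a set of bitmasks under union (|) and intersection (&)."""
    closed = set(points)
    while True:
        new = {c for a in closed for b in closed for c in (a | b, a & b)} - closed
        if not new:
            return closed
        closed |= new

def distance_to(family, x):
    """Hamming distance from bitmask x to the nearest member of family."""
    return min(bin(x ^ y).count("1") for y in family)

def is_submodular(f, n):
    """Check f(x | y) + f(x & y) <= f(x) + f(y) for all pairs of bitmasks."""
    return all(f[x | y] + f[x & y] <= f[x] + f[y]
               for x in range(2 ** n) for y in range(2 ** n))

n = 4
lattice = lattice_closure({0b0011, 0b0110, 0b1100})
d_lattice = [distance_to(lattice, x) for x in range(2 ** n)]

not_closed = {0b0001, 0b0010}   # missing the union 0b0011 and the intersection 0b0000
d_other = [distance_to(not_closed, x) for x in range(2 ** n)]

print(is_submodular(d_lattice, n), is_submodular(d_other, n))
```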



Considering the known fact that the minimizers of any submodular function form a lattice, we 
get the following characterization. 

³ A lattice is any partial order with the operations of "meet" and "join". In our setting, this means a subset of
$\{0,1\}^n$ closed under taking coordinate-wise minimum and maximum. Or equivalently, a family of sets closed under
taking intersections and unions.
Corollary 3.2 Let $S \subseteq \{0,1\}^n$. Then the following statements are equivalent:

1. $S$ is a lattice.

2. $S$ is the set of minimizers of some submodular function.

3. The Hamming distance function $d_S(x) = \min_{y \in S} \|x - y\|_1$ is submodular.

3.2 Functions with one violated square 

We start with the following counterintuitive result.

Theorem 3.3 For any $n$, there is a function $f : \{0,1\}^n \to \mathbb{R}$ which has exactly one violated square
but $2^{n/2}$ values must be modified to make it submodular.

We remark that this statement is tight in the sense that for any function with exactly one violated
square, it is sufficient to modify $2^{n/2}$ values (we leave the proof as an exercise, using Lemma 2.2).
To prove Theorem 3.3, we use Lemma 3.1, which says that any lattice in $\{0,1\}^n$ yields a natural
submodular function. This function does not have any violated squares. However, we will add two 
additional dimensions and extend the function in such a way that each point of the lattice will 
produce exactly one violated square. Moreover, due to the nature of the distance function, the 
function we construct will be a linear function in a large neighborhood of each violated square. This 
will imply that we cannot simply change one value in each violated square if we want to make the 
function submodular - such changes would propagate and force many other values to be changed as 
well. We make this argument precise later. The construction is as follows. 

Construction. Given: Lattice $\mathcal{L} \subseteq \{0,1\}^n$. Output: Function $f : \{0,1\}^{n+2} \to \mathbb{R}$.

• We denote the arguments of $f$ by $(a, b, x)$ where $x \in \{0,1\}^n$ and $a, b \in \{0,1\}$.

• Let $f(0,0,x) = \|x\|_1 = \sum_{i=1}^n x_i$.

• Let $f(1,1,x) = 1 - \|x\|_1 = 1 - \sum_{i=1}^n x_i$.

• Let $f(0,1,x) = f(1,0,x) = d_{\mathcal{L}}(x)$, the Hamming distance function from $\mathcal{L}$.

Lemma 3.4 The function $f(a,b,x)$ constructed above has exactly $|\mathcal{L}|$ violated squares, of the form
$\{(0,0,x), (0,1,x), (1,0,x), (1,1,x)\}$ for each $x \in \mathcal{L}$.

Proof: Observe that for any fixed $a, b \in \{0,1\}$, $f(a,b,x)$ is a submodular function of $x$. Therefore,
there is no violated square $\{z, z + e_i, z + e_j, z + e_i + e_j\}$ unless at least one of $i, j$ is a special bit.

If exactly one of $i, j$ is a special bit, we can assume that it is the first special bit. First assume the
other special bit is 0, therefore we are looking at a square with values $f(0,0,x), f(1,0,x), f(0,0,x +
e_j), f(1,0,x + e_j)$. By construction, we know that $f(0,0,x + e_j) - f(0,0,x) = 1$ and $f(1,0,x + e_j) -
f(1,0,x) = d_{\mathcal{L}}(x + e_j) - d_{\mathcal{L}}(x) \le 1$, therefore the square cannot be violated. Similarly, if the other
special bit is 1, we are looking at a square with values $f(0,1,x), f(1,1,x), f(0,1,x + e_j), f(1,1,x + e_j)$.
Here, we always have $f(1,1,x + e_j) - f(1,1,x) = -1$, and $f(0,1,x + e_j) - f(0,1,x) = d_{\mathcal{L}}(x + e_j) -
d_{\mathcal{L}}(x) \ge -1$. So again, the square cannot be violated.

Finally, consider a square where $i, j$ are exactly the special bits. The square has values $f(0,0,x)$,
$f(0,1,x)$, $f(1,0,x)$, $f(1,1,x)$. Observe that $f(0,0,x) + f(1,1,x) = 1$, and $f(0,1,x) + f(1,0,x) =
2 d_{\mathcal{L}}(x)$. The square is violated if and only if $2 d_{\mathcal{L}}(x) < 1$, i.e. when $x \in \mathcal{L}$. This means that we have
a one-to-one correspondence between violated squares and the points of the lattice. □
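The construction and the count in Lemma 3.4 can be reproduced on a small instance. In this sketch the two special bits are placed in coordinates 0 and 1 of a tuple of length $n + 2$; the names `build_f`, `flip` and `violated_squares`, and the particular four-point lattice, are illustrative assumptions.

```python
from itertools import combinations, product

def build_f(lattice, n):
    """f on {0,1}^(n+2); the first two coordinates are the special bits a, b."""
    def dist(x):
        return min(sum(u != v for u, v in zip(x, y)) for y in lattice)
    f = {}
    for x in product((0, 1), repeat=n):
        f[(0, 0) + x] = sum(x)
        f[(1, 1) + x] = 1 - sum(x)
        f[(0, 1) + x] = f[(1, 0) + x] = dist(x)
    return f

def flip(z, i):
    """Return z + e_i for a 0/1 tuple z with z[i] == 0."""
    return z[:i] + (1,) + z[i + 1:]

def violated_squares(f, m):
    """List of (z, i, j), i < j, for all violated squares with bottom point z."""
    out = []
    for z in product((0, 1), repeat=m):
        zeros = [k for k in range(m) if z[k] == 0]
        for i, j in combinations(zeros, 2):
            if f[z] + f[flip(flip(z, i), j)] > f[flip(z, i)] + f[flip(z, j)]:
                out.append((z, i, j))
    return out

n = 3
lattice = {(0, 0, 0), (0, 1, 1), (1, 0, 0), (1, 1, 1)}  # closed under min and max
squares = violated_squares(build_f(lattice, n), n + 2)
print(len(squares), all((i, j) == (0, 1) for _, i, j in squares))
```

Every violated square found this way uses exactly the two special directions, and its bottom point projects to a lattice point, as the lemma states.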

Thus we can generate functions with a prescribed number of violated squares, depending on our
initial lattice $\mathcal{L}$. The simplest example is generated by $\mathcal{L} = \{x\}$ being a 1-point lattice. In this
case, it is easy to verify directly that the function $d_{\mathcal{L}}$ is submodular, and hence our construction
produces exactly one violated square.

The second part of our argument, however, should be that such a function is not very close to 
submodular. In particular, consider $\mathcal{L} = \{x\}$ where $\|x\|_1 = n/2$. Suppose that we want to modify
some values so that the function $f$ becomes submodular. We certainly have to modify at least one
value in the violated square $\{(a, b, x) : a, b \in \{0,1\}\}$. However, for each fixed choice of $a, b \in \{0,1\}$,
the function $f(a,b,x)$ is linear. The last point in our argument is that it is impossible to modify
a small number of values "in the middle" of a linear function (with many values both above and 
below), so that the resulting function is submodular. First, we prove the following. 

Lemma 3.5 Suppose $f : \{0,1\}^n \to \mathbb{R}$ is a submodular function and $f(0) > 0$. Then there are at
least $2^{n-1}$ points $x \in \{0,1\}^n$ such that $f(x) \ne 0$.

Note that this is tight, for example by taking $f(x) = 1 - x_1$.
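For very small $n$, the bound of Lemma 3.5 and its tightness can be confirmed by exhaustive search. The sketch below enumerates all functions on $\{0,1\}^3$ with values in $\{-1, 0, 1\}$; restricting to this value set is an assumption made only to keep the search finite. Points are encoded as bitmasks, and `is_submodular` and `fewest_nonzero` are illustrative names.

```python
from itertools import product

def is_submodular(f, n):
    """Check f(x | y) + f(x & y) <= f(x) + f(y) for all pairs of bitmasks."""
    return all(f[x | y] + f[x & y] <= f[x] + f[y]
               for x in range(2 ** n) for y in range(2 ** n))

n = 3
# Smallest number of nonzero values over all submodular f with f(0) > 0.
fewest_nonzero = min(
    sum(v != 0 for v in f)
    for f in product((-1, 0, 1), repeat=2 ** n)
    if f[0] > 0 and is_submodular(f, n)
)
print(fewest_nonzero, 2 ** (n - 1))
```

The minimum is attained, for instance, by $f(x) = 1 - x_1$, which takes values in $\{0, 1\}$ and so lies inside the enumerated family.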

Proof: We prove the statement by induction on $n$. Obviously it is true for $n = 1$. For $n > 1$, we
partition the cube $\{0,1\}^n$ as follows: let

$$Q_i = \{x \in \{0,1\}^n : x_1 = \ldots = x_{i-1} = 0, x_i = 1\}.$$

In other words, $Q_i$ is the set of points such that the first nonzero coordinate is $x_i$. We have
$\{0,1\}^n = \{0\} \cup \bigcup_{i=1}^n Q_i$. Now consider a submodular function $f : \{0,1\}^n \to \mathbb{R}$ such that $f(0) > 0$.
We consider two cases.

If there is a coordinate $i$ such that $f(e_i) \le 0$, then the discrete derivative $\partial_i f(0)$ is negative. By
submodularity, $\partial_i f$ must be negative everywhere. Hence, for any point $x$ such that $x_i = 0$, at least
one of $f(x), f(x + e_i)$ is nonzero.

The other case is that $f(e_i) > 0$ for all $i \in [n]$. Then we apply the inductive hypothesis to $Q_i$,
which implies that at least $\frac{1}{2}|Q_i|$ values in $Q_i$ are nonzero. By adding up the contributions from
$Q_1, \ldots, Q_n$, we conclude that at least half of all the values in $\{0,1\}^n$ are nonzero. □

To rephrase the lemma, we can start with a zero function on $\{0,1\}^n$, increase the value of $f(0)$
to a positive value, and ask: how many other values do we have to modify to make the function
submodular? The lemma says that at least $2^{n-1}$ values must be modified. In fact, the condition of
submodularity does not change under the addition of a linear function, so the zero function can be 
replaced by any linear function. Thus the lemma says that it is impossible to increase the value of 
a linear function at the lowest point of a cube, without changing a lot of other values in the cube. 

Note that it is possible to decrease the value of a linear function at the lowest point of a cube 
and this does not create any violation of submodularity. What is impossible is to decrease the value 
"in the middle" of a linear function, without changing a lot of other values. This is the content of 
the next lemma. 

Lemma 3.6 Suppose $n$ is even, $f : \{0,1\}^n \to \mathbb{R}$ is a submodular function and $f(x) < 0$ for some
$x$ with $\|x\|_1 = n/2$. Then there are at least $2^{n/2}$ points $y \in \{0,1\}^n$ such that $f(y) \neq 0$.

This lemma is also tight, by taking $f(y) = -1$ whenever $y \leq x$ and $f(y) = 0$ otherwise.

Proof: Consider $Q = \{y \in \{0,1\}^n : y \leq x\}$; this is a cube of dimension $n/2$, hence $|Q| = 2^{n/2}$.
If $f(y) \neq 0$ for all $y \in Q$, we are done. Therefore, assume that there is a point $y \in Q$ such that
$f(y) = 0$. Then consider a monotone path from $y$ to $x$; there must be an edge $(y', y' + e_j)$ of negative
marginal value. By submodularity, all edges $(z', z' + e_j)$ for $z' \geq y'$ must have negative marginal
value. There are at least $2^{n/2}$ such edges, since all the $n/2$ zero bits in $x$ are also zero in $y'$ and can
be increased arbitrarily to obtain a point $z' \geq y'$. Each of these (disjoint) edges $(z', z' + e_j)$ contains
a point of nonzero value, and hence there are at least $2^{n/2}$ such points. □
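
The tightness example for Lemma 3.6 can be checked the same way on a small cube. The sketch below encodes sets as bitmasks and tests the lattice form $f(S \cup T) + f(S \cap T) \leq f(S) + f(T)$ directly; the names `is_submodular_mask`, `x` and `f` are illustrative choices, and $n = 4$ is an arbitrary small dimension.

```python
def is_submodular_mask(f, n):
    """Check f(S | T) + f(S & T) <= f(S) + f(T) for all pairs of bitmask sets."""
    size = 1 << n
    return all(
        f(s | t) + f(s & t) <= f(s) + f(t)
        for s in range(size)
        for t in range(size)
    )

n = 4
x = 0b0011  # a point of weight n/2

def f(y):
    # y <= x exactly when y has no bit outside x
    return -1 if (y & ~x) == 0 else 0

nonzero = sum(1 for y in range(1 << n) if f(y) != 0)
print(is_submodular_mask(f, n), nonzero, 2 ** (n // 2))
```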

Now we can complete the proof of Theorem 3.3.

Proof: [Theorem 3.3] Consider the function $f : \{0,1\}^{n+2} \to \mathbb{R}$ defined for a 1-point lattice $C = \{x\}$,
$\|x\|_1 = n/2$. By Lemma 3.4, $f$ has exactly one violated square. Note that for each fixed $a, b \in \{0,1\}$,
the function $f(a,b,x)$ is a linear function of $x$.

Suppose $f' : \{0,1\}^{n+2} \to \mathbb{R}$ is submodular (presumably close to $f$). Since $f$ has a violated
square $\{(0,0,x), (0,1,x), (1,0,x), (1,1,x)\}$, $f'$ must differ from $f$ on at least one of these values.
Fix $a, b \in \{0,1\}$ such that $f'(a,b,x) \neq f(a,b,x)$ and consider the function $f'(a,b,x) - f(a,b,x)$
as a function of $x$. Since $f$ is linear, $f' - f$ is again submodular as a function of $x$. We have
$(f' - f)(x) \neq 0$. If $(f' - f)(x) > 0$, we apply Lemma 3.5 to the cube $\{y : y \geq x\}$; if $(f' - f)(x) < 0$,
we apply Lemma 3.6. In both cases, we conclude that there are at least $2^{n/2}$ values $x \in \{0,1\}^n$ such
that $f'(x) \neq f(x)$. Therefore, $f$ is $2^{-n/2}$-far from submodular. □



3.3 Boosting the example to increase distance 

Observe that in Theorem 3.3, the relationship between relative distance and density of violated
squares is quadratic: we have relative distance $\epsilon = 2^{-n/2}$ and density of violated squares $\approx \epsilon^2 = 2^{-n}$.
In order to prove the second part of Theorem 1.4, we need to consider a denser lattice. Since the
regions of linearity will be more complicated here, we need a more general statement to argue about
the number of values that must be fixed to make a function submodular.

Lemma 3.7 Let $f : \{0,1\}^n \to \mathbb{R}$ be submodular (non-increasing marginals) on a down-monotone
subset $\mathcal{D} \subseteq \{0,1\}^n$. If $f(0) > 0$ then there are at least $\frac{1}{n+1}|\mathcal{D}|$ points $y \in \mathcal{D}$ such that $f(y) \neq 0$.

This is also tight: consider for example $\mathcal{D} = \{0, e_1, \ldots, e_n\}$ and $f(x) = 1 - \|x\|_1$.

Proof: Suppose $f(y) = 0$ for some $y \in \mathcal{D}$. Then let $x \leq y$ be minimal such that $f(x) \leq 0$. Since
$x$ is minimal (and cannot be $0$ because $f(0) > 0$), for any $x_i = 1$ we have $f(x - e_i) > 0$. Hence
$f(x) - f(x - e_i) < 0$ and by submodularity $f(y) - f(y - e_i) < 0$. Since $f(y) = 0$, this implies that
$f(y - e_i) > 0$. In this case we call $y - e_i$ a witness for $y$.

To summarize, for every $y \in \mathcal{D}$ we have either $f(y) \neq 0$ or $f(y - e_i) \neq 0$ for some witness $y - e_i$ of $y$.
Since every point can serve as a witness for at most $n$ other points, the number of nonzero values
must be at least $|\mathcal{D}|/(n + 1)$. □
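
The witness argument can be traced on the tight example. The sketch below is only an illustration: `find_witnesses` is a hypothetical helper that, for each zero point $y$, searches for a neighbor $y - e_i$ of positive value, and $n = 5$ is an arbitrary choice. On $\mathcal{D} = \{0, e_1, \ldots, e_n\}$ the single nonzero point $0$ serves as the witness for all $n$ unit vectors.

```python
def find_witnesses(f, D):
    """For each y in D with f(y) == 0, look for a witness y - e_i with f(y - e_i) > 0."""
    found = {}
    for y in D:
        if f(y) != 0:
            continue
        for i, bit in enumerate(y):
            if bit:
                z = y[:i] + (0,) + y[i + 1:]
                if f(z) > 0:
                    found[y] = z
                    break
    return found

n = 5
zero = (0,) * n
units = [tuple(int(i == k) for i in range(n)) for k in range(n)]
D = [zero] + units            # down-monotone: the origin and the unit vectors
f = lambda x: 1 - sum(x)      # tight example f(x) = 1 - ||x||_1

witness_of = find_witnesses(f, D)
nonzero = [y for y in D if f(y) != 0]
print(len(nonzero), len(D) // (n + 1))
```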

Now we are ready to prove the second part of Theorem 1.4.

Proof: We define $C \subseteq \{0,1\}^n$ as follows:

• Consider $n$ even and partition $[n]$ into pairs $\{2i - 1, 2i\}$, $1 \leq i \leq n/2$.

• Let $C = \{x \in \{0,1\}^n : \forall i;\ x_{2i-1} = x_{2i}\}$.

Obviously, this is a lattice, in fact it is isomorphic to a cube of dimension $n/2$. The function
$f : \{0,1\}^{n+2} \to \mathbb{R}$ based on this lattice has exactly $2^{n/2}$ violated squares, due to Lemma 3.4. It
remains to estimate the distance of $f$ from being submodular.

To that end, focus on the "middle layer" of the lattice, $M = \{x \in C : \|x\|_1 = n/2\}$. Such points
have exactly half of the pairs equal to $(0,0)$ and half equal to $(1,1)$. For each such point $x$,
consider points $y \geq x$ such that $y$ still has the same number of pairs equal to $(1,1)$ as $x$. Formally,
let

$Q_x = \{y \geq x : \forall i;\ y_{2i-1} = y_{2i} = 1 \Rightarrow x_{2i-1} = x_{2i} = 1\}.$

The reason for this definition is that for any point $y \in Q_x$, it is possible to trace it back to $x$ (by
zeroing out all the pairs which are not equal to $(1,1)$, we obtain $x$). Hence the sets $Q_x$ are disjoint.
The path from $y$ to $x$ is also the shortest possible path to any point of the lattice (because it is
necessary to modify all pairs which are equal to $(1,0)$ or $(0,1)$). In other words, $d_C(y) = \|x - y\|_1$
for any $y \in Q_x$. This implies that the function $f(a,b,y)$ for any fixed $a, b$ is linear as a function of
$y \in Q_x$.

Our final argument is that in order to make $f$ submodular, we would have to fix many values in
each set $Q_x$. Let us assume that $f'$ is submodular. Since $f$ has a violated square $\{(0,0,x), (0,1,x),
(1,0,x), (1,1,x)\}$ for each $x \in C$, $f'$ must be different from $f$ in at least one point in each such
square. More specifically, $f'$ must be larger than $f$ for one of the points $(0,1,x)$, $(1,0,x)$ or $f'$ must
be smaller than $f$ for one of the points $(0,0,x)$, $(1,1,x)$.

Fix $a, b$ so that $f'(a,b,x)$ differs from $f(a,b,x)$ as above. Since $f$ is linear on $Q_x$, we have $f' - f$
submodular on $Q_x$ and $(f' - f)(a,b,x) \neq 0$. If $a \neq b$, we must have $(f' - f)(a,b,x) > 0$. Then
applying Lemma 3.7 to the set $Q_x - x$, we conclude that $f' - f$ must be nonzero on at least
$\frac{1}{n+1}|Q_x|$ points in $Q_x$.

In the other case, $a = b$, we have $(f' - f)(a,b,x) < 0$. Note that in this case $f$ is actually linear
on all of $\{0,1\}^n$ and $f' - f$ is submodular everywhere. Then we use arguments similar to Lemma 3.6.
Let $Q_x^-$ be the set of points $y \leq x$ such that the set of $(0,0)$ pairs is the same in $y$ and $x$. Again,
$y \in Q_x^-$ can be traced back to $x$ and so these sets are disjoint. From the proof of Lemma 3.6, we
obtain that either $f(y) \neq 0$ for all $y \in Q_x^-$, or else there is an edge $(x - e_i, x)$ of negative marginal
value. This implies that all edges above this edge have negative marginal value, i.e., at least half of
the points in $Q_x \cup (Q_x - e_i)$ must have nonzero value.

Now let us count the size of $Q_x$. We have $n/4$ pairs of value $(0,0)$ which can be modified and
we have 3 choices for each (we avoid $(1,1)$ for such pairs). Therefore, $|Q_x| = 3^{n/4}$. The same holds
for $Q_x^-$.
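
The count $|Q_x| = 3^{n/4}$ and the trace-back property can be confirmed by enumeration for a small case. The following sketch assumes $n = 8$ and 0-indexed coordinates, so the pairs are $(0,1), (2,3), \ldots$; the helper names `Q` and `trace_back` are illustrative.

```python
from itertools import product

def Q(x):
    """All y >= x whose (1,1) pairs are already (1,1) pairs of x."""
    pairs = range(len(x) // 2)
    out = []
    for y in product((0, 1), repeat=len(x)):
        if any(b < a for a, b in zip(x, y)):
            continue  # y is not above x
        if all((y[2 * k], y[2 * k + 1]) != (1, 1) or (x[2 * k], x[2 * k + 1]) == (1, 1)
               for k in pairs):
            out.append(y)
    return out

def trace_back(y):
    """Zero out every pair of y that is not (1,1)."""
    z = []
    for k in range(len(y) // 2):
        pair = (y[2 * k], y[2 * k + 1])
        z.extend(pair if pair == (1, 1) else (0, 0))
    return tuple(z)

n = 8
x1 = (1, 1, 1, 1, 0, 0, 0, 0)   # two middle-layer lattice points
x2 = (1, 1, 0, 0, 1, 1, 0, 0)
Q1, Q2 = Q(x1), Q(x2)
print(len(Q1), 3 ** (n // 4))
```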

This holds for every lattice point in the middle layer $M$. Therefore, each lattice point $x \in M$
contributes $\Omega(3^{n/4}/n)$ nonzero points in $f' - f$. There are $\binom{n/2}{n/4} = \Omega(2^{n/2}/n)$ points in $M$. We have
to be careful about the last case where the nonzero points are guaranteed to be in $Q_x \cup (Q_x - e_i)$
rather than $Q_x$. Such points could be potentially overcounted $n$ times, but we had a $1/2$-fraction
of them nonzero, so we still get $\Omega(3^{n/4}/n)$ nonzero points from each point in $M$. Overall, we get
$\Omega(2^{n/2} 3^{n/4})$ nonzero points in $f' - f$. This means that the distance of $f$ from being submodular is
$\epsilon = \Omega(2^{-n/2} 3^{n/4})$. A calculation reveals that this is $\epsilon \approx \Omega(2^{-0.104n})$, while the density of violated
squares is $2^{-n/2} < \epsilon^{4.8}$.
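
The two constants quoted above follow from a short calculation, sketched here with polynomial factors in $n$ ignored. The variable names are illustrative: `exponent` is the $c$ with $2^{-n/2} 3^{n/4} = 2^{-cn}$, and `ratio` is the power $k$ with $2^{-n/2} = (2^{-cn})^k$.

```python
import math

# 2^(-n/2) * 3^(n/4) = 2^(-c*n) with c = 1/2 - log2(3)/4
exponent = 0.5 - 0.25 * math.log2(3)

# density 2^(-n/2) equals epsilon^k for k = (1/2) / c
ratio = 0.5 / exponent

print(round(exponent, 4), round(ratio, 2))
```

Since the exact power is slightly larger than 4.8, the density $2^{-n/2}$ is indeed below $\epsilon^{4.8}$ for $\epsilon = 2^{-cn}$.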

Finally, it is easy to boost this example to a larger value of $\epsilon$. Suppose we want to construct an
example for a given $n$ and $\epsilon = 2^{-0.104n'}$, $n' < n$ ($n'$ can even be a constant). Assume for simplicity
that $n = an'$ and $a$ is an integer. Then we start from an example on $n'$ coordinates where the
distance is $\epsilon = 2^{-0.104n'}$ and the density of violated squares is $2^{-n'/2}$. We extend $f$ to dimension
$n = an'$ so that it does not depend on the new coordinates. There are no violated squares involving
the new coordinates and hence the density of violated squares as well as the relative distance remain
unchanged. □


4 Path certificates for submodular extension



Given a partial function $f$, can we get a precise characterization of when $f$ is submodular-extendable?
Using LP duality, we can give a combinatorial condition that captures this property. In this
section, $f$ will be some fixed partial function. We will set $\mathcal{D} = \mathrm{def}(f)$ and $\mathcal{U} = \mathcal{B} \setminus \mathcal{D}$. Let
us associate a variable $x_S$ with every set $S$. If $S \in \mathcal{D}$, then $x_S$ has value $f(S)$ (so this is not
really a variable, but it will be convenient to keep this notation). For a set $S$, $\Delta^+(S)$ is the set
$\{e = (S, S+i) \mid i \notin S\}$ and $\Delta^-(S)$ is the set $\{e = (S-i, S) \mid i \in S\}$. For an edge $e = (S, S+i)$, $\Gamma^+(e)$
is the set $\{e' = (S+j, S+i+j) \mid j \notin S\}$. The set $\Gamma^-(e)$ is $\{e' = (S-j, S+i-j) \mid j \in S\}$. If $f$
is extendable, then the following LP has a feasible solution.

$\forall e,\ \forall e' \in \Gamma^+(e): \quad x_e - x_{e'} \geq 0$
$\forall e = (S, S+i): \quad x_e - x_{S+i} + x_S = 0$

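Using Farkas' lemma, if this is infeasible, then we can derive a contradiction from these constraints.
So, we have dual variables $y_{e,e'}$, $y_e$ associated with each constraint, and the following LP is feasible.

$\forall e: \quad y_e + \sum_{e' \in \Gamma^+(e)} y_{e,e'} = \sum_{e' \in \Gamma^-(e)} y_{e',e}$
$\forall S \in \mathcal{U}: \quad \sum_{e \in \Delta^+(S)} y_e = \sum_{e \in \Delta^-(S)} y_e$
$\forall e,\ \forall e' \in \Gamma^+(e): \quad y_{e,e'} \geq 0$

$\sum_{S \in \mathcal{D}} \left[ \sum_{e \in \Delta^-(S)} y_e - \sum_{e \in \Delta^+(S)} y_e \right] f(S) < 0$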

Definition 4.1 Consider a set of directed paths $\mathbf{P}$ consisting of cycles or paths with endpoints in
$\mathcal{D}$. An edge is upward if it is directed from the smaller set to the larger, and downward otherwise.

Let $U$ be the multiset of upward edges of $\mathbf{P}$ and $D$ be the multiset of downward edges (so we keep
as many copies of edge $e$ as occurrences in $\mathbf{P}$). Let $G$ be a bipartite graph on $U$ and $D$ (with links,
instead of edges). An edge $e \in U$ is linked to $e' \in D$ if $e \preceq e'$. The set of paths $\mathbf{P}$ is matched if
there is a perfect matching in $G$.

The value of a directed path $\mathcal{P}$, $\mathrm{val}(\mathcal{P})$, that starts at $S \in \mathcal{D}$ and ends at $S' \in \mathcal{D}$ is $f(S') - f(S)$.
Cycles have value 0. The value of $\mathbf{P}$ is the sum of values of the paths in $\mathbf{P}$. If $\mathbf{P}$ has negative value,
then $\mathbf{P}$ is referred to as a path certificate.
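
To make Definition 4.1 concrete, here is a minimal sketch that tests whether a set of paths is matched and computes its value. It rests on assumptions not spelled out in this section: sets are Python `frozenset`s, a path is a list of sets, and $e \preceq e'$ is taken to mean that the two edges add the same element and the lower endpoint of $e$ is contained in that of $e'$. The perfect matching is found with a plain augmenting-path search; all function names are illustrative.

```python
def edge_key(a, b):
    """Return (lower set, added element, is_upward) for the directed edge a -> b."""
    lo, hi = (a, b) if len(a) < len(b) else (b, a)
    (i,) = hi - lo
    return lo, i, len(a) < len(b)

def is_matched(paths):
    ups, downs = [], []
    for p in paths:
        for a, b in zip(p, p[1:]):
            lo, i, up = edge_key(a, b)
            (ups if up else downs).append((lo, i))
    if len(ups) != len(downs):
        return False
    # Link an upward edge to a downward edge when both are parallel and the
    # upward one lies below (assumed meaning of the order on edges).
    adj = [[k for k, (lo2, i2) in enumerate(downs) if i2 == i and lo <= lo2]
           for lo, i in ups]
    match = [-1] * len(downs)

    def augment(u, seen):
        for k in adj[u]:
            if k not in seen:
                seen.add(k)
                if match[k] == -1 or augment(match[k], seen):
                    match[k] = u
                    return True
        return False

    return all(augment(u, set()) for u in range(len(ups)))

def value(paths, f):
    """Sum of f(end) - f(start) over the paths; cycles contribute 0."""
    return sum(f[p[-1]] - f[p[0]] for p in paths if p[0] != p[-1])

E, A, B, AB = frozenset(), frozenset({1}), frozenset({2}), frozenset({1, 2})
f = {E: 0, A: 0, B: 0, AB: 1}      # a single violated square
paths = [[E, A], [AB, B]]          # one upward edge, one downward edge
print(is_matched(paths), value(paths, f))
```

In this toy instance the two singleton paths form a matched set of value $-1$, which agrees with the square being violated.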

Lemma 4.2 The partial function $f$ is not submodular-extendable iff $f$ contains a path certificate.

Proof: Suppose $\mathbf{P}$ is a path certificate, but $f$ can be extended to a submodular function $f'$. Let
$U$ be the multiset of upward edges in $\mathbf{P}$ and $D$ the multiset of downward edges. We have a perfect
matching between $U$ and $D$. Consider a matched pair $(e, e')$. We have $e \preceq e'$. By the submodularity
of $f'$, $f'(e) \geq f'(e')$. Considering $e, e'$ as directed edges, we get $f'(e) + f'(e') \geq 0$. Summing over all
matched pairs, $\sum_{e \in \mathbf{P}} f'(e) \geq 0$. Consider a path $\mathcal{P} \in \mathbf{P}$. Note that $\mathrm{val}(\mathcal{P})$ is the same in $f$ and $f'$,
since $f'$ extends $f$. Considering $\mathcal{P}$ as a multiset of directed edges, we have $\mathrm{val}(\mathcal{P}) = \sum_{e \in \mathcal{P}} f'(e)$.
We get $\sum_{\mathcal{P} \in \mathbf{P}} \mathrm{val}(\mathcal{P}) \geq 0$. Contradiction.

Suppose $f$ cannot be extended to a submodular function. By Farkas' lemma, the second LP is
feasible. Consider the directed hypercube (abusing notation, call this graph $\mathcal{B}$). The second equality
is a flow conservation constraint for all vertices in $\mathcal{U}$. Hence, we can think of the $y_e$'s as giving a
flow in $\mathcal{B}$, where the terminals are $\mathcal{D}$. Precisely, $y_e$ is the flow in $e$ from the lower end to the higher
end. The first constraint is a little strange$^4$. Consider the graph $G$, where the vertices are edges of
the hypercube, and there is a directed link from $e$ to every member of $\Gamma^+(e)$. This actually gives $n$
disconnected graphs, each of which is a hypercube in $n - 1$ dimensions. Think of $y_{e,e'}$ as a flow in
$G$. Note that this is always nonnegative. We do not really have a flow conservation condition, because
of the extra $y_e$. Add an extra terminal for every $e$ that is attached to the vertex $e \in G$. This is called
the terminal $e \in G$. Think of $y_e$ as the amount of flow being removed (if $y_e > 0$) or injected (if $y_e < 0$)
into $e$ from this terminal. Then, we have a legitimate flow in $G$ represented by the $y_{e,e'}$'s.

Since the $y$ values are rational, we can assume that they are integral. We will construct a path
certificate through a flow decomposition process. At an intermediate stage, we will maintain a set
$\mathbf{P}$ of directed paths in $\mathcal{B}$ and a list of matched pairs in $\mathbf{P}$. For each matched pair, we have a directed
path in $G$ from the smaller edge to the larger (call this set of paths $\mathbf{Q}$). All these paths start
and end at terminals in their respective graphs. We maintain the following invariants. Through
every path in $\mathbf{P} \cup \mathbf{Q}$, a single unit of flow can be simultaneously routed, in the flow given by the $y$
values. Furthermore, a directed edge $e$ in $\mathbf{P}$ is upward iff $y_e > 0$. Flow in any directed edge of $\mathbf{Q}$ is
always positive. Suppose the current set of paths $\mathbf{P}$ is not completely matched. We will describe a
procedure that either increases the number of matched pairs, or adds a new path to both $\mathbf{P}$ and $\mathbf{Q}$.
That means that the total flow that is routed through $\mathbf{P}$ (and $\mathbf{Q}$) increases by one. Since the flow
is finite, this process must terminate and return a set of matched paths.

Suppose there is an unmatched edge $e \in \mathbf{P}$ (wlog, we can take it to be upward). This means
that $y_e$ is positive. Note that because $\mathbf{P}$ can be considered as a multiset of edges, there could be
many copies of the upward edge $e$ in $\mathbf{P}$. Suppose there are $t$ copies, which means that $t$ paths in
$\mathbf{P}$ pass through $e$. Since we can route one unit of flow in each of these paths simultaneously, $y_e \geq t$.
Let us look at the situation in $G$. At most $t - 1$ copies of $e$ are matched, so there are at most $t - 1$
paths in $\mathbf{Q}$ that end at the terminal $e \in G$ (since $y_e > 0$, there is a net influx at terminal $e \in G$).
Let us route a single unit of flow through all paths in $\mathbf{Q}$ (and remove this flow). This must still
leave at least one unit of flow going into $e$. So, we can route one unit of flow from some $e'$ to $e$ along
a path $\mathcal{Q}$. Note that because the flow is always positive in $G$, $e' \succ e$.

Note that $y_{e'} < 0$, because in $G$, the terminal $e'$ has a net outflow. Suppose there is an unmatched
copy of $e'$ in $\mathbf{P}$ (it must be downward). Then we can match $e$ to this copy of $e'$, and we are done.
Suppose this is not the case. Let $s$ be the number of copies of the downward edge $e'$ in $\mathbf{P}$ (all of
these are matched). We argue that $s < |y_{e'}|$. Suppose, for the sake of contradiction, that $|y_{e'}| = s$.
Then, there are $s$ paths in $\mathbf{Q}$ that start at the terminal $e' \in G$. If we remove all the flow paths
corresponding to $\mathbf{Q}$, then there is no flow going out of $e'$. But we were able to route one unit of
flow from $e'$ to $e$ along $\mathcal{Q}$ after removing the flow corresponding to $\mathbf{Q}$. Contradiction. Hence $|y_{e'}| > s$.
This means that after removing all the flow corresponding to $\mathbf{P}$ (in $\mathcal{B}$), there is still at least one unit
of (downward) flow left on $e'$. So, after the removal, we can still route one unit of flow through $e'$,
giving us a path or cycle $\mathcal{P}$. We add $\mathcal{P}$ to $\mathbf{P}$ and $\mathcal{Q}$ to $\mathbf{Q}$, observing that the invariants are maintained.
This ends the procedure.

Finally, we end up with a set of matched paths $\mathbf{P}$. If this has negative value, we have found our
certificate. Suppose it has nonnegative value. We argue that we can find a new (integral) solution
for the dual which has a smaller flow. This is done by just removing one unit of flow along all paths in
the final $\mathbf{P}$ and $\mathbf{Q}$. Consider some upward edge $e$ in $\mathbf{P}$. Since $\mathbf{P}$ is completely matched, the number
of copies of $e$ in $\mathbf{P}$ is exactly the number of paths in $\mathbf{Q}$ ending at terminal $e$ in $G$. Hence, the $y$
values, after the decrease, will maintain the flow conservation conditions. The original value of the
solution is negative, and we removed a set of matched paths of nonnegative value. So, the value of the
remaining solution is still negative. This gives us the new solution for the dual. □

$^4$ By that we mean somewhat different, and not an unknown dwarf.

A path in $\mathbf{P}$ is called a singleton if it consists of only a single edge. We will prove some "clean-up"
claims that provide us with nice path certificates.

Claim 4.3 Let $f$ be a partial function. Let $f$ contain a set of matched paths $\mathbf{P}$ and let $e$ be an
upward edge in $\mathbf{P}$ that is matched to a downward copy of itself. There is an operation that converts
$\mathbf{P}$ to $\mathbf{P}'$ such that $\mathbf{P}'$ contains the same multiset of edges as $\mathbf{P}$ except for an upward and downward copy
of $e$. The matching of $\mathbf{P}'$ is identical to that of $\mathbf{P}$ (except for the matched pair of $e$) and $\mathrm{val}(\mathbf{P}) = \mathrm{val}(\mathbf{P}')$.

Proof: Let $e = (S, S+i)$. Suppose path $\mathcal{P}_u$ contains edge $e$ upwards, and $\mathcal{P}_d$ contains it
downwards. We can split $\mathcal{P}_u$ into portions $\mathcal{P}_{1,u}$ and $\mathcal{P}_{2,u}$ such that the former is the part before $e$ and the
latter is after $e$. Similarly, we can get $\mathcal{P}_{1,d}$ and $\mathcal{P}_{2,d}$. Note that $\mathcal{P}_{1,u}$ ends at $S$ and $\mathcal{P}_{2,d}$ starts at $S$.
Similarly, $\mathcal{P}_{1,d}$ ends at $S+i$ and $\mathcal{P}_{2,u}$ starts at $S+i$. We can combine $\mathcal{P}_{1,u}$ and $\mathcal{P}_{2,d}$ to get a path $\mathcal{P}'_1$.
Similarly, we get $\mathcal{P}'_2$ from $\mathcal{P}_{1,d}$ and $\mathcal{P}_{2,u}$. We replace $\mathcal{P}_u$ and $\mathcal{P}_d$ by $\mathcal{P}'_1$ and $\mathcal{P}'_2$. Note that the sum of
values does not change. Also, the only edges removed are the upward and downward copies of $e$ and the
matching on the remaining edges stays the same. □

Claim 4.4 Let $f$ be a partial function such that for any square of $\mathcal{B}$, at most 2 points are present in
$\mathrm{def}(f)$. Let $f$ contain a path certificate $\mathbf{P}$, such that no edge occurs both upward and downward in
$\mathbf{P}$. There exists a path certificate $\mathbf{Q}$ such that $\mathbf{Q}$ contains no singleton edge. Furthermore, no edge
in $\mathbf{Q}$ appears both upward and downward.

Proof: We will show how to remove any singleton in $\mathbf{P}$ and give an "equivalent" certificate $\mathbf{Q}$.
The value will remain the same. Suppose there is a singleton path consisting of an upward edge $e$.
Some downward edge $e'$, $e' \succeq e$, must occur in a path $\mathcal{P} \in \mathbf{P}$. If $e = e'$, then this edge occurs both
upward and downward. This cannot happen. So $e' \succ e$. Let $e = (S, S+i)$ and $e' = (T+i, T)$, for
some $S \subset T$. We will split $\mathcal{P}$ into two paths. Let $\mathcal{P}_1$ be the portion of $\mathcal{P}$ before $e'$ and $\mathcal{P}_2$ be the
portion after $e'$. Note that $\mathcal{P}_1$ ends at $T+i$ and $\mathcal{P}_2$ starts at $T$. Consider a downward path $\mathcal{Q}_1$ from
$T+i$ to $S+i$ and a parallel upward path $\mathcal{Q}_2$ from $S$ to $T$. Observe that there is a perfect matching
between the edges of $\mathcal{Q}_1$ and those of $\mathcal{Q}_2$.

Consider the path $\mathcal{Q}'_1$ formed by joining $\mathcal{P}_1$ to $\mathcal{Q}_1$, and the similarly constructed $\mathcal{Q}'_2$ (joining $\mathcal{Q}_2$ to $\mathcal{P}_2$). Note that
$\mathcal{Q}'_1$ ends at $S+i$ and $\mathcal{Q}'_2$ starts at $S$. To get $\mathbf{Q}$, we remove the singleton $e$ from $\mathbf{P}$ and replace
$\mathcal{P}$ by $\mathcal{Q}'_1$ and $\mathcal{Q}'_2$. The set $\mathbf{Q}$ is completely matched. The edges in $\mathcal{Q}_1$ and $\mathcal{Q}_2$ (matched to each
other) are disjoint. Hence, no edge in $\mathbf{Q}$ appears both upward and downward. The singleton edge
$e$ starts at $S$ and ends at $S+i$. So $\mathrm{val}(\mathcal{Q}'_1) + \mathrm{val}(\mathcal{Q}'_2) = \mathrm{val}(e) + \mathrm{val}(\mathcal{P})$, and $\mathrm{val}(\mathbf{Q}) = \mathrm{val}(\mathbf{P})$.
Suppose $|\mathcal{Q}_1| > 1$. Then neither of $\mathcal{Q}'_1$ and $\mathcal{Q}'_2$ is a singleton. Suppose $\mathcal{Q}_1$ is a single edge. Then
$e$ and $e'$ form a square, so neither endpoint of $e'$ can be in $\mathrm{def}(f)$. This means that the paths
$\mathcal{P}_1$ and $\mathcal{P}_2$ are at least of length 1 and $\mathcal{Q}'_1$ and $\mathcal{Q}'_2$ are at least of length 2. The total number of
singletons has decreased by 1. We can repeatedly apply this procedure, and remove all singletons. □

4.1 Large minimal certificates 

This will require many steps. We will start by giving a construction of a long cycle in $\mathcal{B}$ with some
special properties. This cycle will be a sort of "frame" on which we can define $f$. For this $f$, we will
find a set of matched paths of negative value, showing that $f$ is non-extendable.

The simple cycle will be obtained by performing a series of moves in $\mathcal{B}$. An upward (resp.
downward) step is one where some coordinate is incremented (resp. decremented). We will assume
that $n = 2m + 4$. The cycle will only involve points in levels $m+1, m+2, m+3, m+4$ of $\mathcal{B}$.
We will call these levels the 1, 2, 3, 4 levels. Any point is represented as $(b_1, b_2, b_3, b_4, S, T)$, where the
$b_i$'s are bits, and $S$ and $T$ are sets on $m$ elements. We will denote the starting (and hence, ending)
point of the cycle to be $(0, 0, 1, 0, \emptyset, [m])$, where $[m]$ represents the complete set on $m$ elements. The
cycle $C$ has the following properties:

• The cycle is simple, i.e., does not intersect itself. 

• The cycle can be divided into a sequence of contiguous chunks of three steps. Every odd (resp.
even) chunk has three upward (resp. downward) steps. There are an even number of chunks.

• The cycle has $M \geq 2^m$ chunks.

• Let the $i$th chunk be denoted by $\mathcal{K}_i$. The second edge $e$ of $\mathcal{K}_i$ is parallel to the first edge $e'$ of
$\mathcal{K}_{i+1 \pmod M}$. Suppose $i$ is odd. Then $\mathcal{K}_i$ has upward steps, and hence $e' \succ e$. Similarly, if $i$ is
even, $e' \prec e$.

A crucial combinatorial property of the hypercube that we use is the existence of Hamiltonian
circuits. We set $\mathcal{H}$ to be a (directed) Hamiltonian circuit on the $m$-dimensional hypercube. For
any set $R \in \mathcal{H}$, $s(R)$ denotes the successor of $R$ in $\mathcal{H}$. The complement path $\overline{\mathcal{H}}$ is the Hamiltonian
circuit obtained by taking the set-complement of every point in $\mathcal{H}$.

Lemma 4.5 There exists a cycle $C$ with the properties above.

Proof: Starting from a point $(0, 0, 1, 0, R, \overline{R})$, we will give a sequence of 4 chunks that will end at
$(0, 0, 1, 0, s(s(R)), \overline{s(s(R))})$. Since $\mathcal{H}$ is a Hamiltonian circuit, we get a cycle. The reason we keep $R$
and $\overline{R}$ is that from $(\cdots, R, \overline{R})$, we can perform a single upward and then downward step to reach
$(\cdots, s(R), \overline{s(R)})$. We will assume that the moves to both $s(R)$ and $s(s(R))$ are upward. Whenever
this is not the case, we can just reverse the roles of $R$ (or $s(R)$) and $\overline{R}$ (or $\overline{s(R)}$).

We describe the sequence of chunks. In the arrows below, the labels above them represent the
coordinate being changed. The numbers 1, 2, 3, 4 represent the first four coordinates. If the label
is a set, then that set is being changed by moving along (appropriately) either $\mathcal{H}$ or $\overline{\mathcal{H}}$. These
labels help verify the matching property. The first and third chunks only have upward steps, and
the remaining have only downward steps. For convenience, $S = s(R)$ and $T = s(S)$.

1. $(0,0,1,0,R,\overline{R}) \xrightarrow{1} (1,0,1,0,R,\overline{R}) \xrightarrow{2} (1,1,1,0,R,\overline{R}) \xrightarrow{\mathcal{H}} (1,1,1,0,S,\overline{R})$.

2. $(1,1,1,0,S,\overline{R}) \xrightarrow{2} (1,0,1,0,S,\overline{R}) \xrightarrow{3} (1,0,0,0,S,\overline{R}) \xrightarrow{\overline{\mathcal{H}}} (1,0,0,0,S,\overline{S})$.

3. $(1,0,0,0,S,\overline{S}) \xrightarrow{3} (1,0,1,0,S,\overline{S}) \xrightarrow{4} (1,0,1,1,S,\overline{S}) \xrightarrow{\mathcal{H}} (1,0,1,1,T,\overline{S})$.

4. $(1,0,1,1,T,\overline{S}) \xrightarrow{4} (1,0,1,0,T,\overline{S}) \xrightarrow{1} (0,0,1,0,T,\overline{S}) \xrightarrow{\overline{\mathcal{H}}} (0,0,1,0,T,\overline{T})$.

It is easy to see that no point can occur in two different chunks, because the sets on $\mathcal{H}$ or $\overline{\mathcal{H}}$
are different. So, the cycle is simple. The number of chunks is at least the number of points in the
$m$-dimensional hypercube. The matching property should be clear. □
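
The construction can be replayed mechanically for small $m$. The sketch below makes several assumptions that are not in the text: it uses the reflected Gray code as the Hamiltonian circuit, encodes the two sets as bitmasks, requires $m \geq 2$, and handles a downward move of the circuit by swapping the roles of the set and its complement as described in the proof. It then checks simplicity, the alternation of upward and downward chunks, and the parallel-edge property.

```python
def gray_cycle(m):
    """Hamiltonian circuit on the m-cube via the reflected Gray code."""
    return [k ^ (k >> 1) for k in range(1 << m)]

# Bit patterns reached after the first and second step of chunks 1..4.
BIT_MOVES = [
    ((1, 0, 1, 0), (1, 1, 1, 0)),
    ((1, 0, 1, 0), (1, 0, 0, 0)),
    ((1, 0, 1, 0), (1, 0, 1, 1)),
    ((1, 0, 1, 0), (0, 0, 1, 0)),
]

def build_chunks(m):
    H = gray_cycle(m)
    full = (1 << m) - 1
    bits, A, Abar = (0, 0, 1, 0), H[0], full ^ H[0]
    chunks = []
    for k, R in enumerate(H):
        nxt = H[(k + 1) % len(H)]
        grows = (nxt & R) == R            # is s(R) a superset of R?
        for half in (0, 1):               # upward chunk, then downward chunk
            mid, end = BIT_MOVES[(2 * k + half) % 4]
            chunk = [(bits, A, Abar), (mid, A, Abar), (end, A, Abar)]
            bits = end
            if (half == 0) == grows:
                A = nxt                   # move along the circuit
            else:
                Abar = full ^ nxt         # move along the complement circuit
            chunk.append((bits, A, Abar))
            chunks.append(chunk)
    return chunks

def level(p):
    bits, A, Abar = p
    return sum(bits) + bin(A).count("1") + bin(Abar).count("1")

def flipped(p, q):
    """Indices among the four leading bits that differ between p and q."""
    return [i for i, (a, b) in enumerate(zip(p[0], q[0])) if a != b]

m = 3
chunks = build_chunks(m)
M = len(chunks)
points = [p for c in chunks for p in c[:3]]
print(M, len(points), len(set(points)))
```

With $m = 3$ this yields $M = 16$ chunks on 48 distinct points, all within levels $m+1$ to $m+4$.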

We now define the function $f$. Let the directed path consisting of the first two edges of chunk
$\mathcal{K}_i$ be $\mathcal{P}_i$. Note that $\mathcal{P}_{2i}$ is downward and $\mathcal{P}_{2i+1}$ is upward. We describe the function $f$ and state
many properties of $\mathrm{def}(f)$. It will be convenient to define the following sequences of 4 bits. We
set $B_1 = (0, 0, 1, 0)$, $B_2 = (1, 0, 0, 0)$, $C_1 = (1, 1, 1, 0)$, and $C_2 = (1, 0, 1, 1)$. We use $A$ to denote any
one of these.

• The function $f$ will be defined on all the endpoints of the $\mathcal{P}_i$'s.

• For $\mathcal{P}_1$, the small endpoint has value $v$ (the exact choice for this is immaterial), and the larger
endpoint has value $v + 1$. For $\mathcal{P}_{2i+1}$ ($i > 0$), the small end has value $v$ and the large end has
value $v + 2$. For $\mathcal{P}_{2i}$ ($\forall i$), the large end has value $v + 2$ and the small end has value $v$.

• Fix any $R$. One and only one point of the form $(B_j, R, \overline{R})$ is present in $\mathrm{def}(f)$. Similarly,
one and only one of $(C_j, R, \overline{R})$ is present in $\mathrm{def}(f)$. We also have $(B_j, R, \overline{R}) \in \mathrm{def}(f)$ iff
$(C_j, R, \overline{R}) \in \mathrm{def}(f)$. No other point is present in levels 1 and 3.

• Fix any $R$. Suppose $s(R) \supset R$. One and only one of $(B_j, s(R), \overline{R})$ is present in $\mathrm{def}(f)$. Similarly,
one and only one of $(C_j, s(R), \overline{R})$ is present in $\mathrm{def}(f)$. We also have $(B_j, s(R), \overline{R}) \in \mathrm{def}(f)$ iff
$(C_j, s(R), \overline{R}) \in \mathrm{def}(f)$. No other point is present in levels 2 and 4.

Suppose $s(R) \subset R$. Then these points are of the form $(A, R, \overline{s(R)})$.

• Pairs of neighbors in $\mathrm{def}(f)$ are either level 1-level 2 pairs, or level 3-level 4 pairs. They
are always of the following form: $(A, R, \overline{R}) \to (A, s(R), \overline{R})$ (if $R \subset s(R)$) or $(A, R, \overline{R}) \to
(A, R, \overline{s(R)})$ (if $R \supset s(R)$).

• For any point of $\mathrm{def}(f)$, there is at most one neighbor present in $\mathrm{def}(f)$. Hence, any square of
$\mathcal{B}$ contains at most 2 points of $\mathrm{def}(f)$.

• Consider some point $(B_j, R, \overline{R})$ in level 1. The only point in level 3 at Hamming distance 2
from this point is $(C_j, R, \overline{R})$. A similar statement holds for points in level 2.

Claim 4.6 The function $f$ is not submodular-extendable.

Proof: By Lemma 4.2, it suffices to show a path certificate. As the astute reader might have
guessed, all the $\mathcal{P}_i$'s form such a set. A matching exists because of the fourth property of the cycle
$C$. The value of $\mathcal{P}_1$ is 1. The value of any other $\mathcal{P}_{2i+1}$ is 2. Every $\mathcal{P}_{2i}$ has value $-2$. Since the total
number of chunks is even, the value of this set of paths is $-1$. □

We will now show that $f|_S$ for any $S \subset \mathrm{def}(f)$ is extendable. It will be easiest to show that by
proving that any path certificate for $f$ must essentially be the $\mathcal{P}_i$'s.

Claim 4.7 Suppose $f$ contains a set of matched paths $\mathbf{P}$ with no singletons. This $\mathbf{P}$ must be the
set of all $\mathcal{P}_i$'s.

Proof: Consider a point $X$ in $\mathbf{P}$ that lies in the lowest level (the number of 1s in the representation
of the point is minimized). We argue that this point only has upward edges incident to it. If there
is a downward edge $e$ incident to it, then $\mathbf{P}$ must contain an upward edge $e'$ that is matched to
$e$. Therefore, $e' \prec e$ and the lower end of $e'$ must lie in a lower level than $X$. This contradicts the
choice of $X$. Hence, $X$ only has upward edges incident to it. This means that it can never be in the
interior of a path, and must be a terminal. Therefore, $X \in \mathrm{def}(f)$. Similarly, points in $\mathbf{P}$ that lie in
the highest level only have downward edges incident to them, and are also in $\mathrm{def}(f)$.

The points of $\mathrm{def}(f)$ lie in levels $m+1, m+2, m+3, m+4$, called the 1, 2, 3, 4 levels. Edges
between the 1 and 2 levels are called low edges, those between the 2 and 3 levels are middle edges,
and those between the 3 and 4 levels are high edges. All edges of $\mathbf{P}$ fall into one of these three sets.
Low edges are always upward and high edges are always downward. Middle edges are matched to
either low or high edges. Therefore, the number of middle edges is exactly the same as the total
number of low and high edges. Since $\mathbf{P}$ contains no singletons, every path must contain at least one
middle edge. The total number of low and high edges in a path is at most 1. This implies that every
path in $\mathbf{P}$ has exactly two edges and has one of the two forms: an upward low and middle edge, or
a downward high and middle edge. The former paths go from level 1 to level 3 and the latter from
level 4 to level 2. We must have at least one path of each type to get both upward and downward
edges. Therefore there is some level 1 point of $\mathrm{def}(f)$ in $\mathbf{P}$.

Consider some point $X = (0, 0, 1, 0, R, \overline{R})$ at level 1 that is a terminal in $\mathbf{P}$. Let path $\mathcal{Q} \in \mathbf{P}$ start
from here. Note that this is the endpoint for some $\mathcal{P}_i$, which is $(0, 0, 1, 0, R, \overline{R}) \to (1, 0, 1, 0, R, \overline{R}) \to
(1, 1, 1, 0, R, \overline{R})$. The certificate $\mathbf{P}$ has an upward path of length 2 from $X$. The properties of $\mathrm{def}(f)$
tell us that the other end of $\mathcal{Q}$ can only be $(1, 1, 1, 0, R, \overline{R})$. It does not immediately follow that $\mathcal{Q}$
is $\mathcal{P}_i$, since there are two different paths between these points (the endpoints differ in coordinates
1 and 2). But observe that the second edge of $\mathcal{Q}$ must be matched by a downward edge between
levels 4 and 3. This edge has an endpoint in level 4 that must be a neighbor of $(1, 1, 1, 0, R, \overline{R})$. By
the properties of $\mathrm{def}(f)$, this point must be $(1, 1, 1, 0, s(R), \overline{R})$ (assuming $s(R) \supset R$). All downward
paths of length 2 from this point end at $(1, 0, 0, 0, s(R), \overline{R})$. The path changes coordinates 2 and
3. Since the second edge of $\mathcal{Q}$ is matched to the first edge of this path, both of these edges must be
along coordinate 2. Hence, $\mathcal{Q}$ is $\mathcal{P}_i$, and $\mathcal{P}_{i+1 \pmod M}$ also lies in $\mathbf{P}$. Repeating the argument, we get
that all $\mathcal{P}_i$'s lie in $\mathbf{P}$. This completes the proof. □

Proof: (Theorem 1.7) By Claim 4.6, the function $f$ is not submodular-extendable. For some
subset $A \subset \mathrm{def}(f)$, suppose $f|_A$ is not submodular-extendable. Since $\mathrm{def}(f)$ contains no squares,
by Claim 4.4, there is a path certificate $\mathbf{P}$ in $\mathrm{def}(f|_A)$ that contains no singletons. Note that $\mathbf{P}$ is
also a path certificate for $f$. By Claim 4.7, $\mathbf{P}$ contains all $\mathcal{P}_i$'s. But that means that $\mathbf{P}$ contains all
points in $\mathrm{def}(f)$. Contradiction. □



5 From monotonicity to submodularity 

In this section, we show a simple reduction from testing monotonicity to testing submodularity. 

Lemma 5.1 Given $f : \{0,1\}^n \to \mathbb{R}$, there exists a function $g : \{0,1\}^{n+1} \to \mathbb{R}$ with the following
properties:

• If $f$ is monotonically non-increasing, then $g$ is submodular.

• If $f$ is $\epsilon$-far from being monotonically non-increasing, then $g$ is $\epsilon/2$-far from being submodular.

• The value $g(x)$ can be computed by looking at 2 values of $f$.

Proof: We will use small letters $x, y$, etc. to denote points in $\{0,1\}^n$. Points in $\{0,1\}^{n+1}$ will be
denoted by $(0, x)$ or $(1, x)$, where the first bit denotes the absence or presence of the new element.
We use $e^*$ to denote the unit vector corresponding to the new element, and $e_i, e_j$ to denote the
other unit vectors. For convenience, monotone will mean monotonically non-increasing. Define
$h(x) = f(0)\,\|x\|_1 (n - \|x\|_1)$. We define $g$ by the following: $g(0, x) = h(x)$, and $g(1, x) = f(x) + h(x)$.
So any value of $g$ can be computed by looking at 2 values of $f$.

We first show that $h$ is submodular. Consider $x$ and $i, j$ such that $x_i = x_j = 0$. Let $\|x\|_1 = r$
and $f(0) = M$.

$h(x + e_i) + h(x + e_j) - h(x + e_i + e_j) - h(x)$
$= M[2(r + 1)(n - r - 1) - r(n - r) - (r + 2)(n - r - 2)]$
$= M[(2nr - 2r^2 - 2r + 2n - 2r - 2) - nr + r^2 - nr + r^2 + 2r - 2n + 2r + 4]$
$= M[(2nr - 2r^2 + 2n - 4r - 2) - (2nr - 2r^2 + 2n - 4r - 4)] = 2M$

Hence $h$ is submodular.

Assume that $f$ is monotone. Then, for any $x$, $f(x) \leq f(0) = M$. Since $f(x + e_i) + f(x + e_j) -
f(x + e_i + e_j) - f(x) \geq -2M$ (for $f$ taking values in $[0, M]$), $f + h$ is also submodular.

Suppose $g$ is not submodular. Then there exists a violated square in $g$. Suppose this square
does not involve $e^*$. This square is contained in a copy of $\{0,1\}^n$ where the function is equal to $h$ or
$f + h$. But this would imply that either $h$ or $f + h$ is non-submodular. So, this square must involve
$e^*$. Then we have the following:

$0 < g(0, x) + g(1, x + e_i) - g(0, x + e_i) - g(1, x) = f(x + e_i) - f(x).$

This violates the non-increasing property of $f$. Hence, we conclude that $g$ is submodular.

Now, suppose that $f$ is $\epsilon$-far from being monotone. Furthermore, suppose we can modify $\epsilon 2^n$
values of $g$ to get a submodular function $g'$. Consider the function $f'(x) = g'(1, x) - g'(0, x)$. Since $g'$
is submodular, $f'$ must be monotone. Since $g'$ differs from $g$ in at most $\epsilon 2^n$ values, the monotone
function $f'$ differs from $f$ in at most $\epsilon 2^n$ values. This is a contradiction. So, $g$ must be $\epsilon/2$-far from
being submodular. □
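
The reduction is easy to exercise on small inputs. The sketch below builds $g$ from $f$ exactly as defined in the proof and brute-forces submodularity on $\{0,1\}^{n+1}$. The two test functions are illustrative choices, not from the paper: `f_mono` is non-increasing with values in $[0, f(0)]$ (the range under which the bound on second differences used above holds), and `f_bad` is strictly increasing.

```python
from itertools import product

def is_submodular(g, n):
    """Check every square: g(x+e_i) + g(x+e_j) >= g(x) + g(x+e_i+e_j)."""
    for x in product((0, 1), repeat=n):
        for i in range(n):
            for j in range(i + 1, n):
                if x[i] or x[j]:
                    continue
                xi = x[:i] + (1,) + x[i + 1:]
                xj = x[:j] + (1,) + x[j + 1:]
                xij = xi[:j] + (1,) + xi[j + 1:]
                if g(xi) + g(xj) < g(x) + g(xij):
                    return False
    return True

def reduce_to_submodularity(f, n):
    """Return g on {0,1}^(n+1): g(0,x) = h(x), g(1,x) = f(x) + h(x)."""
    M = f((0,) * n)

    def h(x):
        r = sum(x)
        return M * r * (n - r)

    def g(z):
        x = z[1:]                      # z[0] marks the new element
        return h(x) + (f(x) if z[0] else 0)

    return g

n = 3
f_mono = lambda x: (n - sum(x)) ** 2   # non-increasing, values in [0, f(0)]
f_bad = lambda x: sum(x)               # increasing, so not monotone here

g_mono = reduce_to_submodularity(f_mono, n)
g_bad = reduce_to_submodularity(f_bad, n)
print(is_submodular(g_mono, n + 1), is_submodular(g_bad, n + 1))
```

Each evaluation of `g` reads at most two values of `f`, namely $f(0)$ and $f(x)$, matching the third property of the lemma.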

By the results in [FLN+02], there is an $\Omega(\sqrt{n})$ non-adaptive and an $\Omega(\log n)$ adaptive lower bound
for 1-sided monotonicity testers. We get the following corollary.

Corollary 5.2 Any non-adaptive 1-sided tester for submodularity requires $\Omega(\sqrt{n})$ queries. Any
adaptive 1-sided tester requires $\Omega(\log n)$ queries.



6 Future work 

All of this work is centered on the following very general question: what really makes a function 
submodular? Of course, it is "just" monotonicity of marginal values, but this does not capture the 
full structure of submodular functions. We want to understand how different sets of values in a sub- 
modular function interact and influence each other. The problem of property testing submodularity 
appears to be a very appealing way of studying this question. Our constructions show that functions 
far from being submodular could have marginal values that are much closer to being monotone. 

The problem of completing partial functions comes up when we try to understand how to
convert a non-submodular function into a submodular one (a major component of a property testing
proof). Again, our constructions yield insight into how seemingly unconnected parts of a submodular 
function must be related. 

We believe there is considerable scope for further research. Many interesting questions remain
open, and we have barely seen the tip of the iceberg. We state some of them here.

1. Relation between violated squares and distance to submodularity: For a function $f$ that is $\epsilon$-far from
being submodular, what is the minimum density of violated squares (as a function of $\epsilon$ and $n$) it
can have? Can we prove that this minimum density is at least poly$(\epsilon/n)$?

2. Efficient testers for submodularity: Does there exist a tester for submodularity with running
time poly$(n/\epsilon)$, or maybe poly$(n)$ for constant $\epsilon$? Perhaps we can find an efficient adaptive tester,
or a tester that searches for obstructions other than violated squares.

3. Testing rank functions: A matroid gives rise to a rank function, which is always submodular.
A function is a rank function iff it is a submodular function with marginal values 0 or 1. Can we
test whether an input function $f$ is a rank function? Note that even though rank functions are a special case
of submodular functions, it is not clear that this is easier (or harder). This is because the distance
to a rank function might be significantly different from the distance to submodularity.

4. Testing matroid independence oracles: Any matroid can be represented as a collection of
independent sets. Suppose we have a function that tells us whether a set is independent (for some 
purported matroid). Can we efficiently test whether this function is indeed a valid independence 
oracle? This seems like a rather fundamental question about matroids. 
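
The quantities in questions 1 and 3 can be made concrete for very small $n$ by exhaustive enumeration. The following is a minimal sketch, not a tester: it reads all $2^n$ values, and the names `square_violation_density` and `looks_like_rank_function` are hypothetical. It represents a set function as a Python callable on `frozenset` subsets of `range(n)`, and it additionally requires $f(\emptyset) = 0$, the usual normalization for rank functions, which the statement of question 3 leaves implicit.

```python
from itertools import combinations


def subsets(n):
    """Yield every subset of {0, ..., n-1} as a frozenset."""
    for k in range(n + 1):
        for s in combinations(range(n), k):
            yield frozenset(s)


def square_violation_density(f, n):
    """Fraction of squares (S, S+i, S+j, S+i+j) that violate submodularity."""
    total = violated = 0
    for s in subsets(n):
        rest = [e for e in range(n) if e not in s]
        for i, j in combinations(rest, 2):
            total += 1
            if f(s | {i}) + f(s | {j}) < f(s) + f(s | {i, j}):
                violated += 1
    return violated / total if total else 0.0


def looks_like_rank_function(f, n):
    """Check: f(empty) = 0, all marginal values in {0, 1}, no violated square."""
    if f(frozenset()) != 0:
        return False
    for s in subsets(n):
        for e in range(n):
            if e not in s and f(s | {e}) - f(s) not in (0, 1):
                return False
    return square_violation_density(f, n) == 0.0
```

For example, the rank function of a uniform matroid, $r(S) = \min(|S|, 2)$ on four elements, passes the check, while $f(S) = |S|^2$ violates every square. Whether a function that is $\epsilon$-far from submodular must have violation density at least poly$(\epsilon/n)$ is exactly what question 1 asks; the code only measures the density, it does not bound it.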

Acknowledgement. We thank Deeparnab Chakrabarty for very useful discussions. Indeed, the
main question of whether submodularity is testable came up during discussions with him.